CSAR: Cluster Storage with Adaptive Redundancy
Abstract
Striped file systems such as the Parallel Virtual File System (PVFS) deliver high-bandwidth I/O to applications running on clusters. An open problem for existing striped file systems is how to provide efficient data redundancy that reduces their vulnerability to disk failures. In this paper we describe CSAR, a version of PVFS augmented with a novel redundancy scheme that addresses the efficiency issue while using unmodified stock file systems. By dynamically switching between RAID1 and RAID5 redundancy based on write size, CSAR achieves RAID1 performance on small writes and RAID5 efficiency on large writes. On a microbenchmark, our scheme achieves identical read bandwidth and 73% of the write bandwidth of PVFS over 7 I/O nodes. We describe the issues in implementing our new scheme in a popular striped file system such as PVFS on a Linux cluster with a high-performance I/O subsystem.
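The size-based switching described in the abstract can be illustrated with a minimal sketch. It is not the CSAR implementation: the function names `plan_write` and `xor_parity`, the parameters `stripe_unit` and `data_nodes`, and the specific rule (stripe-aligned full stripes get RAID5-style parity, everything else is mirrored RAID1-style) are assumptions made here for illustration only.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def plan_write(offset, length, stripe_unit, data_nodes):
    """Split a write into (scheme, start, end) byte ranges.

    Assumed policy (hypothetical, not taken from the paper):
    a range that covers a whole, aligned stripe is tagged 'raid5';
    any partial-stripe range is tagged 'raid1'.
    """
    stripe = stripe_unit * data_nodes  # bytes of data in one full stripe
    end = offset + length
    segments = []
    pos = offset
    while pos < end:
        stripe_end = (pos // stripe + 1) * stripe
        seg_end = min(end, stripe_end)
        is_full_stripe = pos % stripe == 0 and seg_end - pos == stripe
        scheme = "raid5" if is_full_stripe else "raid1"
        # Merge with the previous segment when the scheme is unchanged.
        if segments and segments[-1][0] == scheme and segments[-1][2] == pos:
            segments[-1] = (scheme, segments[-1][1], seg_end)
        else:
            segments.append((scheme, pos, seg_end))
        pos = seg_end
    return segments


if __name__ == "__main__":
    # A write that starts mid-stripe, spans one full stripe, and ends mid-stripe.
    for segment in plan_write(offset=8, length=20, stripe_unit=4, data_nodes=3):
        print(segment)
```

Under this assumed policy, a small write pays only the cost of a second copy and avoids the read-modify-write cycle that parity updates require, while a large write stores one parity block per stripe instead of a full mirror.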
Related resources
CSAR-2: a Case Study of Parallel File System Dependability
Modern cluster file systems such as PVFS that stripe files across multiple nodes have been shown to provide high aggregate I/O bandwidth but are prone to data loss, since the failure of a single disk or server affects the whole file system. To address this problem, a number of distributed data redundancy schemes have been proposed that represent different trade-offs between performance, storage effici...
CSAR-2: A Case Study of Parallel File System Dependability Analysis
Modern cluster file systems such as PVFS that stripe files across multiple nodes have been shown to provide high aggregate I/O bandwidth but are prone to data loss, since the failure of a single disk or server affects the whole file system. To address this problem, a number of distributed data redundancy schemes have been proposed that represent different trade-offs between performance, storage effici...
A High Performance Redundancy Scheme for Cluster File Systems
A known problem in the design of striped file systems is their vulnerability to disk failures. In this paper we address the challenges of augmenting an existing file system with traditional RAID redundancy, and we propose a novel hybrid redundancy scheme designed to maximize disk throughput as seen by the applications. To demonstrate the hybrid redundancy scheme we build CSAR, a proof-of-concep...
A Dynamic Deduplication Approach for Big Data Storage
As data grows every day, managing storage devices for this explosive growth of digital data is a very challenging task. Data reduction has become a crucial problem. Deduplication plays a vital role in removing redundancy in large-scale cluster computing storage. As a result, deduplication provides better storage utilization by eliminating redundant copies of data and savi...
Visualization of Time-Dependent Adaptive Mesh Refinement Data
Analysis of phenomena that simultaneously occur on quite different spatial and temporal scales requires adaptive, hierarchical schemes to reduce computational and storage demands. For data represented as grid functions, the key is adaptive, hierarchical, time-dependent grids that resolve spatio-temporal details without too much redundancy. Here, so-called AMR grids are gaining popularity. F...